Abstract

Dai, Li, and Wu proposed Rule k, a localized approximation algorithm
that attempts to find a small connected dominating set in a graph. Here
we consider the "average case" performance of Rule k for the model of
random unit disk graphs constructed from n random points in an
$\ell_n \times \ell_n$ square. If $k \ge 3$ and $\ell_n = o(\sqrt{n})$, then the
expected size of the Rule k dominating set is $\Theta(\ell_n^2)$ as $n \to \infty$.
If $\ell_n \le \sqrt{n/(10 \log n)}$, then the expected size of the minimum CDS
is also $\Theta(\ell_n^2)$.

keywords and phrases: dominating set, localized algorithm, approxima- 
tion algorithm, performance analysis, probabilistic analysis, Rule k, unit 

1 Introduction



In this paper we consider the problem of finding a small connected dominating
set for a unit disk graph $G = (V, E)$, where the vertex set $V$ is a set of points
in $\mathbb{R}^2$. Given the vertex set $V$, the edge set $E$ is determined as follows: an
undirected edge $e \in E$ connects vertices $u, v \in V$ (and in this case we say that
$u$ and $v$ are adjacent) iff the Euclidean distance between them is less than or
equal to one. Unit disk graphs have been used by many authors as simplified
mathematical models for the interconnections between hosts in a wireless
network, and random unit disk graphs have been used as stochastic models for
these networks, e.g. [...]. We particularly mention the work of the
Hipercom Project, e.g. [...], because it is closely related to our work.

A dominating set in any graph $G = (V, E)$ is a subset $C \subseteq V$ such that every
vertex $v \in V$ either is in the set $C$, or is adjacent to a vertex in $C$. We say
$C$ is a connected dominating set if $C$ is a dominating set and the subgraph
induced by $C$ is connected. Obviously $G$ cannot have a connected dominating set
if $G$ itself is not connected. We use the acronym "CDS" for a dominating set $C$
such that the subgraph induced by $C$ has the same number of components as $G$
has. In this paper we consider a random unit disk graph model, $\mathcal{G}_n$, which is
connected with asymptotic probability one. So, in this case, any CDS for $\mathcal{G}_n$
will also be connected with high probability.
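
The definitions above can be checked mechanically on small graphs. The
following is a minimal sketch, assuming the graph is stored as a dictionary
`adj` that maps each vertex to the set of its neighbors; the function names
are illustrative and do not come from the paper.

```python
def components(adj, nodes):
    """Connected components of the subgraph induced by `nodes`."""
    remaining, comps = set(nodes), []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w in remaining:
                    remaining.remove(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def is_dominating(adj, C):
    """Every vertex is in C or adjacent to a vertex of C."""
    C = set(C)
    return all(v in C or bool(adj[v] & C) for v in adj)


def is_cds(adj, C):
    """CDS in the sense used here: dominating, and the subgraph induced
    by C has the same number of components as the whole graph."""
    if not is_dominating(adj, C):
        return False
    return len(components(adj, C)) == len(components(adj, adj))
```

On the path 1-2-3-4, for example, `{2, 3}` is a CDS, while `{1, 4}` is
dominating but induces two components.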

The identification of a small connected dominating set for the graph which
represents the network is an important step in several routing methods. The
general idea of CDS-based algorithms is to select a small CDS, and have only
those nodes responsible for determining routes [...]. It is believed
that, by reducing the number of such nodes, CDS-based algorithms reduce
interference between transmitters in the same region and alleviate a related set
of problems known collectively as "broadcast storm". Furthermore, the cost
of finding and maintaining routing information is smaller because fewer nodes
are involved. However, it is beyond the scope of this paper to consider direct
measures of the utility of a small CDS after it has been found. In this paper we
consider only a single measure of an algorithm's effectiveness, namely the size
of the CDS it finds.

Even with this simple measure of performance, there are non-trivial
algorithmic and analytical problems. It is an NP-hard computational problem to
find the minimum connected dominating set in a unit disk graph [20]. Hence there
is considerable practical interest in designing good approximation algorithms for
finding small connected dominating sets. See, for example, [2], [...].
There have been various efforts to evaluate the average case performance of CDS
algorithms using simulations. However, with the exception of the theoretical
parts of [...], we are not aware of any probabilistic analysis that is proved
mathematically.

In this paper we analyze "Rule k" ($k \ge 3$), a family of localized approximation
algorithms proposed by Dai, Li, and Wu [...]. For each k, Rule k attempts
to find a small CDS. We first choose an appropriate probability model. Then, in
the context of the model, we prove explicit asymptotic bounds on the expected
size of the dominating set that Rule k produces. Thus our contribution is not the
algorithm itself, but rather a mathematically sound analysis of the algorithm.

Before describing Rule k, we introduce some notation. We assume that each
vertex has a unique identifier taken from a totally ordered set. For convenience,
when $|V| = n$, we will use the numbers $1, 2, \dots, n$ as IDs, and will number the
vertices accordingly. If $x_i$ is any vertex, with ID given by $i$, let ... let $N(x_i)$
be the set consisting of $x_i$ and any vertices that are adjacent to $x_i$. The CDS
constructed by the Rule k algorithm is denoted $\mathcal{C}_k(V)$, and its cardinality is
$C_k(V) = |\mathcal{C}_k(V)|$. The elements of $\mathcal{C}_k(V)$ are called "gateway nodes". $\mathcal{C}_k(V)$
consists of all vertices $x_i \in V$ that are not excluded under the following version
of Rule k:

Rule k: Vertex $x_i$ is excluded from $\mathcal{C}_k(V)$ iff $N(x_i)$ contains at least one set
of k vertices $x_{i_1}, x_{i_2}, \dots, x_{i_k}$ such that

• $i_1 > i_2 > \cdots > i_k > i$, and

• the subgraph induced by $\{x_{i_1}, x_{i_2}, \dots, x_{i_k}\}$ is connected, and

• $N(x_i) \subseteq \bigcup_{t=1}^{k} N(x_{i_t})$.

Wu, Li, and Dai proved that $\mathcal{C}_k(V)$ is a CDS, and they conjectured that the Rule
k dominating set is, in some sense, small on average. The main result in this
paper is a proof of their conjecture.
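
The three conditions of Rule k can be sketched as a brute-force test. This is
only an illustration under stated assumptions: the graph is a dictionary `adj`
mapping integer IDs to neighbor sets, every candidate set of exactly k
higher-ID neighbors is enumerated, and the function names are hypothetical. It
is not the localized implementation of Dai, Li, and Wu.

```python
from itertools import combinations


def closed_neighborhood(adj, v):
    """N(v): v together with its adjacent vertices."""
    return adj[v] | {v}


def induces_connected(adj, nodes):
    """True if the subgraph induced by `nodes` is connected."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & nodes:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def rule_k(adj, k):
    """Return the vertices that are NOT excluded by Rule k."""
    gateways = set()
    for i in adj:
        target = closed_neighborhood(adj, i)
        # Only neighbors with a larger ID may take part in a covering set.
        higher = sorted(j for j in adj[i] if j > i)
        excluded = False
        for group in combinations(higher, k):
            cover = set()
            for j in group:
                cover |= closed_neighborhood(adj, j)
            if target <= cover and induces_connected(adj, group):
                excluded = True
                break
        if not excluded:
            gateways.add(i)
    return gateways
```

In the complete graph on vertices 1 to 5 with k = 3, vertices 1 and 2 each
have three higher-ID neighbors that cover them, so only 3, 4, and 5 remain.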

The rest of this paper is organized as follows. In the next section we specify
the model and define the random unit disk graph, $\mathcal{G}_n$. In Section 3 we prove
a local coverage theorem that is needed in Section 4 to prove an upper bound
for $E(C_k(V))$. Finally, in the remainder of the paper, we discuss lower bounds
and optimality issues. The appendix deals with a related algorithm called the
Marking Process.

2 Choice of Models 

Before estimating the expected size of the Rule k dominating set, we must
specify the underlying probability model. For any real number $\ell \ge 1$, let $Q(\ell)$ be
an $\ell \times \ell$ square in $\mathbb{R}^2$. The particular choice of a square will be immaterial,
but its size will be very important. Let $\Omega_{n,\ell} = Q(\ell) \times Q(\ell) \times \cdots \times Q(\ell)$ be
the n-fold product space with the usual product topology. For each $n \ge 1$, let
$X_{n,\ell,1}, X_{n,\ell,2}, \dots, X_{n,\ell,n}$ be a sequence of random points selected independently
from a uniform distribution on $Q(\ell)$ and let $P_{n,\ell}$ denote the uniform probability
measure on $\Omega_{n,\ell}$ induced by the random variables $X_{n,\ell,1}, X_{n,\ell,2}, \dots, X_{n,\ell,n}$.
Finally, let $\mathcal{G}(n, \ell)$ be the random unit disk graph with vertex set
$V_{n,\ell} = \{X_{n,\ell,1}, X_{n,\ell,2}, \dots, X_{n,\ell,n}\}$ that is formed from these vertices by putting
an edge between two vertices iff the Euclidean distance between the two vertices
is less than or equal to one.
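
For experimentation, the model can be sampled directly. The sketch below
assumes the square is $[0, \ell] \times [0, \ell]$ (the paper notes that the particular
square is immaterial) and uses a quadratic-time pairwise distance check; the
function name and the `seed` parameter are illustrative.

```python
import math
import random


def random_unit_disk_graph(n, ell, seed=None):
    """Sample n uniform points in an ell x ell square and join two
    points iff their Euclidean distance is at most one.

    Returns (points, adj) where vertex i (1-based ID) sits at points[i - 1].
    """
    rng = random.Random(seed)
    points = [(rng.uniform(0, ell), rng.uniform(0, ell)) for _ in range(n)]
    adj = {i: set() for i in range(1, n + 1)}
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if math.dist(points[i - 1], points[j - 1]) <= 1.0:
                adj[i].add(j)
                adj[j].add(i)
    return points, adj
```
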

We want to estimate the "average" size of $\mathcal{C}_k(V_{n,\ell})$ for large networks. As it
stands, the expected value $E_{n,\ell}(C_k)$ [$= E(C_k(V_{n,\ell}))$] is defined with respect to
the probability measure $P_{n,\ell}$ on $\Omega_{n,\ell}$ and depends on both $n$ and $\ell$. We shall
not however attempt any multivariate asymptotic estimates. Instead, we choose
a suitable sequence $(\ell_n)_{n=1}^{\infty}$ and consider the expected value $E_{n,\ell_n}(C_k)$ with
respect to $P_{n,\ell_n}$ as $n \to \infty$. To simplify notation throughout, we will (usually)
suppress the dependence on the choice of a sequence $(\ell_n)_{n=1}^{\infty}$. Thus we write $\mathcal{G}_n$
instead of $\mathcal{G}(n, \ell_n)$, and write $E_n(C_k)$ instead of $E_{n,\ell_n}(C_k)$. Suppressing even $n$,
we write $Q$ instead of $Q(\ell_n)$, and $P$ instead of $P_{n,\ell_n}$.

Conditions on the growth rate of $\ell_n$ will be clear from the statements of
theorems. However, to provide some perspective on our choice of growth rates
for $\ell_n$, we mention that it is known that the threshold for connectivity is
$\ell_n = \Theta(\sqrt{n/\log n})$; if $\ell_n$ grows faster than this, then the random unit disk
graph $\mathcal{G}_n$ will be disconnected with probability $1 - o(1)$ as $n \to \infty$. In this case,
with high probability, $\mathcal{C}_k(V_{n,\ell})$ will not be a connected dominating set for $\mathcal{G}_n$.
More precise versions of these remarks are provided in the new book by
Penrose [25], which gives an up-to-date survey of random geometric graphs.

Finally, throughout the remainder of this paper we adopt the following
notation. For any points $p$ and $q$ in $\mathbb{R}^2$, let $d(p, q)$ denote the ordinary Euclidean
distance between $p$ and $q$ in $\mathbb{R}^2$.



3 Local Coverage by k vertices 

The next lemma is a purely geometric result which we require for the proof of
Theorem 2. To state the lemma, we need some notation. Let
$\delta = \frac{1}{2} - \frac{\sqrt{3}}{4} = .0669\ldots$, and define $\rho$ by $\rho + 2\delta = 1$:
$\rho = \frac{\sqrt{3}}{2} = .866\ldots$. Let $p$ be any point in
$Q$, and let $D = D_1(p) \cap Q$ be the set of points in the square $Q$ whose distance
from $p$ is one or less.

Lemma 1 There exist points $z_0, z_1, z_2 \in D$ such that the following two
conditions are satisfied:

• for $s = 0$ and $s = 1$, $d(z_s, z_{s+1}) \le 1 - 2\delta$

• $D \subseteq \bigcup_{s=0}^{2} D_\rho(z_s)$.

Proof: Consider first the case where $D_1(p) \subseteq Q$, i.e. $p$ is a point that is not
near the boundary of the square. We may, without loss of generality, choose the
coordinate system such that $p = (0, 0)$ and such that the axes are parallel to the
sides of the square $Q$. For $s = 0, 1, 2$, let $S_s$ be the sector of $D_1(p)$ consisting
of those points whose polar coordinates $(r, \theta)$ satisfy $r \le 1$ and
$\frac{(2s-1)\pi}{3} \le \theta \le \frac{(2s+1)\pi}{3}$. Let $z_s$ be the point in $S_s$ whose polar
coordinates are $(\frac{1}{2}, \frac{2s\pi}{3})$. Then the first condition is satisfied:
$d(z_s, z_{s+1}) = \sin\frac{\pi}{3} = 1 - 2\delta$. It is also straightforward to check that
for $s = 0, 1, 2$, $S_s \subseteq D_\rho(z_s)$ and so the second condition is satisfied.

Now consider the remaining case where $D_1(p)$ meets the boundary of $Q$.
Choose points $z_0, z_1, z_2$ as before so that $D_1(p) \subseteq \bigcup_{s=0}^{2} D_\rho(z_s)$ and
$d(z_s, z_{s+1}) \le 1 - 2\delta$. We are not done because one or more of the points $z_s$
may not lie in $Q$. In particular, if $z_s \notin Q$, then there is a (unique) $z'_s \in Q$ such
that $d(z_s, z'_s) = \inf\{d(z_s, z) : z \in Q\}$. We replace $z_s$ by $z'_s$ and observe that
every point of $D$ is closer to $z'_s$ than it is to the original point $z_s$. Hence
$S_s \cap Q \subseteq D_\rho(z'_s)$. After replacing all $z_s$ such that $z_s \notin Q$ by the corresponding
$z'_s$, we obtain three points that satisfy the conditions of the lemma. □
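
The interior case of Lemma 1 can be sanity-checked numerically. The sketch
below assumes $p = (0, 0)$, places the three centers at polar coordinates
$(\frac{1}{2}, \frac{2s\pi}{3})$, and tests coverage on a finite grid, so it is evidence for
the claim rather than a proof; the grid size and tolerance are arbitrary choices.

```python
import math

delta = 0.5 - math.sqrt(3) / 4   # .0669...
rho = 1 - 2 * delta              # equals sqrt(3)/2 = .866...

# Centers z_0, z_1, z_2 at radius 1/2 and angles 0, 2*pi/3, 4*pi/3.
centers = [
    (0.5 * math.cos(2 * s * math.pi / 3), 0.5 * math.sin(2 * s * math.pi / 3))
    for s in range(3)
]


def is_covered(x, y, tol=1e-12):
    """True if (x, y) lies in at least one disk of radius rho around a center."""
    return any(math.hypot(x - cx, y - cy) <= rho + tol for cx, cy in centers)


def unit_disk_is_covered(grid=200):
    """Check every grid point of the closed unit disk against the three disks."""
    for a in range(-grid, grid + 1):
        for b in range(-grid, grid + 1):
            x, y = a / grid, b / grid
            if x * x + y * y <= 1.0 and not is_covered(x, y):
                return False
    return True
```
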

Fix $k \ge 3$, the $k$ in "Rule k". Suppose $m$ points $P_1, P_2, \dots, P_m$ are selected
independently and uniformly at random in $D$. Let $\mathcal{K}_m$ be the event that, for
some $1 \le i_0 < i_1 < i_2 < \cdots < i_{k-1} \le m$, we have:

• $D \subseteq \bigcup_{s=0}^{k-1} D_1(P_{i_s})$, and

• the unit disk graph with vertices $P_{i_0}, P_{i_1}, \dots, P_{i_{k-1}}$ is connected.

We note that event $\mathcal{K}_m$ implies that the random unit disk graph which is formed
from the vertices $P_1, P_2, \dots, P_m$ has a k-point connected dominating set. With
this notation we can state

Theorem 2 There is a positive constant $\alpha < 1$ and a positive constant $m_k$ such
that, for all $m \ge m_k$, $\Pr(\mathcal{K}_m) \ge 1 - 4\alpha^m$.

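Proof: Choose points $z_0, z_1, z_2$ as in the proof of Lemma 1. If $z$ is any point in
$D_\delta(z_s)$, then for all $y \in S_s$, $d(z, y) \le d(z, z_s) + d(z_s, y) \le \delta + \rho < 1$. Let $\mathcal{E}_s$ be
the event that none of the $m$ random points $P_1, P_2, \dots, P_m$ lies in $D_\delta(z_s)$. Then

$$\Pr(\mathcal{E}_s) = \left(1 - \frac{\mathrm{Area}(D_\delta(z_s) \cap Q)}{\mathrm{Area}(D)}\right)^m. \tag{1}$$

Note that $\mathrm{Area}(D_\delta(z_s) \cap Q) \ge \frac{1}{4}\mathrm{Area}(D_\delta(z_s)) = \frac{\pi\delta^2}{4}$ and that
$\mathrm{Area}(D) \le \mathrm{Area}(D_1(p)) = \pi$. If we let $\alpha = 1 - \frac{\delta^2}{4} = .998\ldots$, then $\alpha < 1$,
and for $s = 0, 1, 2$,

$$\Pr(\mathcal{E}_s) \le \alpha^m. \tag{2}$$

It follows from (2) that $\Pr(\mathcal{K}_m) \ge 1 - 3\alpha^m$ since $\bigcap_{s=0}^{2} \mathcal{E}_s^c \subseteq \mathcal{K}_m$, and the
proof is complete if $k = 3$.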

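Now suppose that $k > 3$, and let $Y$ be the number of the $m$ random points
that lie in $D_\rho(z_2) \cap Q$. Since $\{Y \ge k\} \cap \left(\bigcap_{s=0}^{2} \mathcal{E}_s^c\right) \subseteq \mathcal{K}_m$, we have

$$\Pr(\mathcal{K}_m) \ge 1 - \Pr\Big(\bigcup_{s=0}^{2} \mathcal{E}_s\Big) - \Pr(Y < k). \tag{3}$$

But $Y$ has a binomial $(m, p)$ distribution, where

$$p = \frac{\mathrm{Area}(D_\rho(z_2) \cap Q)}{\mathrm{Area}(D)} \ge \frac{\pi\rho^2/4}{\pi} = \frac{3}{16}. \tag{4}$$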

Hence, for all $m \ge k$,

$$\Pr(Y < k) = \sum_{j=0}^{k-1} \binom{m}{j} p^j (1-p)^{m-j} \tag{5}$$

$$\le m^k (1-p)^{m-k} \le m^k \left(\tfrac{13}{16}\right)^{m-k}. \tag{6}$$

Since $\frac{13}{16} < \alpha$, it follows that, as $m \to \infty$,

$$\Pr(Y < k) = o(\alpha^m). \tag{7}$$

Put (2), (3), and (7) together to conclude: there is a positive constant $m_k$ such
that, for all $m \ge m_k$,

$$\Pr(\mathcal{K}_m^c) \le 4\alpha^m. \tag{8}$$

□
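
Because $\alpha$ is very close to 1, it may help to see how large $m$ must be
before the bound of Theorem 2 says anything. The sketch below evaluates
$\alpha = 1 - \delta^2/4$ and searches for the first $m$ with $1 - 4\alpha^m > 0$. It only
evaluates the stated bound; it does not compute the constant $m_k$ of the
theorem, and the function names are illustrative.

```python
import math

delta = 0.5 - math.sqrt(3) / 4
alpha = 1 - delta ** 2 / 4   # .998...


def coverage_bound(m):
    """The quantity 1 - 4 * alpha**m; it is negative for small m."""
    return 1 - 4 * alpha ** m


def first_nontrivial_m():
    """Smallest m for which 1 - 4 * alpha**m is strictly positive."""
    m = 1
    while coverage_bound(m) <= 0:
        m += 1
    return m
```

The search ends above one thousand points, which illustrates that the
theorem is an asymptotic statement with a crude constant rather than a
practical estimate for small neighborhoods.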



4 Analysis of Rule k 

In this section, we assume that $\ell_n = o(\sqrt{n})$ as $n \to \infty$. Also, in this section,
let $U_k = \sum_{i=1}^{n} I_i$ be a sum of indicator variables where $I_i = 1$ iff node $i$ is not
included in $\mathcal{C}_k(V)$ under Rule k. Thus Rule k selects a dominating set $\mathcal{C}_k(V)$
having $C_k(V) = n - U_k$ vertices, and it is desirable for $U_k$ to be large. Our goal
in this section is to prove that, for all $k > 2$, $E(U_k) \ge n - O(\ell_n^2)$.

Let $\lambda_n = n - \ell_n^2$, and let $X_1, X_2, \dots, X_n$ be independent, uniformly
distributed random points in $Q$, namely the locations of vertices. (Here we are
again simplifying notation by writing $X_i$ instead of $X_{n,\ell_n,i}$.) Let $\rho_i$ be the
number of neighbors of vertex $i$ having a larger ID, i.e. the number of $j > i$ such
that $d(X_i, X_j) \le 1$.

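Lemma 3 If $\ell_n = o(\sqrt{n})$, then

$$P\left(\rho_i < \frac{(n-i)\pi}{8\ell_n^2}\right) \le \exp\left(-\frac{(n-i)\pi}{32\ell_n^2}\right).$$

Proof: Let $|D_1(X_i)| = \mathrm{Area}(D_1(X_i) \cap Q)$ be the area of the set of points in $Q$
whose distance from $X_i$ is one or less. Thus $|D_1(X_i)| = \pi$ unless $X_i$ happens to
fall near the border, and in all cases $|D_1(X_i)| \ge \frac{\pi}{4}$. Given $X_i$, the variable
$\rho_i$ has a Binomial$\left(n - i, \frac{|D_1(X_i)|}{\ell_n^2}\right)$ distribution. Therefore Chernoff's bound on
the lower tail distribution gives

$$P\left(\rho_i < \frac{(n-i)\pi}{8\ell_n^2} \,\Big|\, X_i\right) \le P\left(\rho_i < \frac{|D_1(X_i)|(n-i)}{2\ell_n^2} \,\Big|\, X_i\right)$$

$$\le \exp\left(-\frac{1}{2}\Big(1 - \frac{1}{2}\Big)^2 \frac{|D_1(X_i)|(n-i)}{\ell_n^2}\right) \le \exp\left(-\frac{(n-i)\pi}{32\ell_n^2}\right).$$

□

Theorem 4 If $k > 2$, then $E_n(C_k) = O(\ell_n^2)$.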



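Proof: Let $B_i$ be the event that $\rho_i \ge \frac{(n-i)\pi}{8\ell_n^2}$. By Lemma 3,

$$P(I_i = 1) \ge P(I_i = 1 \mid B_i)P(B_i) \ge P(I_i = 1 \mid B_i)\left(1 - \exp\left(-\frac{(n-i)\pi}{32\ell_n^2}\right)\right). \tag{9}$$

Now suppose that $i < \lambda_n = n - \ell_n^2$, and observe that

$$P(I_i = 1 \mid B_i) = \sum_{v \ge \frac{(n-i)\pi}{8\ell_n^2}} P(I_i = 1 \mid \rho_i = v)\,P(\rho_i = v \mid B_i). \tag{10}$$

To estimate this, observe that

$$P(I_i = 1 \mid \rho_i = v) = \int_Q P(I_i = 1 \mid \rho_i = v, X_i = x)\, f_{X_i}(x \mid \rho_i = v)\, dx \tag{11}$$

where $f_{X_i}(x \mid \rho_i = v)$ is the conditional density of $X_i$ on the square $Q$ given that
$\rho_i = v$. For $v \ge (n-i)\pi/8\ell_n^2$, Theorem 2 yields

$$P(I_i = 1 \mid \rho_i = v, X_i = x) \ge 1 - 4\alpha^v \ge 1 - 4\alpha^{(n-i)\pi/8\ell_n^2}. \tag{12}$$

Putting this back into (11) and then (10), we get

$$P(I_i = 1 \mid B_i) \ge 1 - 4\alpha^{(n-i)\pi/8\ell_n^2}, \tag{13}$$

and therefore

$$P(I_i = 1) \ge P(I_i = 1 \mid B_i)P(B_i) \ge \left(1 - 4\alpha^{(n-i)\pi/8\ell_n^2}\right)\left(1 - \exp\left(-\frac{(n-i)\pi}{32\ell_n^2}\right)\right) \tag{14}$$

$$\ge 1 - 4\alpha^{(n-i)\pi/8\ell_n^2} - \exp\left(-\frac{(n-i)\pi}{32\ell_n^2}\right). \tag{15}$$

Recall that $\lambda_n = n - \ell_n^2$, and that the foregoing estimates were valid for all
$i < \lambda_n$. Putting $j = n - i$, we get

$$E(U_k) \ge \sum_{i < \lambda_n} P(I_i = 1) \ge \sum_{i < \lambda_n} \left(1 - 4\alpha^{(n-i)\pi/8\ell_n^2} - \exp\left(-\frac{(n-i)\pi}{32\ell_n^2}\right)\right) \tag{16}$$

$$\ge \lambda_n - 4\sum_{j=1}^{\infty} \left(\alpha^{\pi/8\ell_n^2}\right)^j - \sum_{j=1}^{\infty} \left(\exp(-\pi/32\ell_n^2)\right)^j \tag{17}$$

$$= n - O(\ell_n^2). \tag{18}$$

□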
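
The last step of the proof rests on the fact that both geometric series in
(17) are $O(\ell_n^2)$. As a rough numerical illustration, the sketch below sums the
two series in closed form, starting each at $j = 1$ and taking $\lambda_n = n - \ell_n^2$
(both are assumptions about the exact form of the estimate), and divides by
$\ell^2$. The function names are illustrative.

```python
import math

delta = 0.5 - math.sqrt(3) / 4
alpha = 1 - delta ** 2 / 4


def geometric_tail(ratio):
    """Sum of ratio**j over j = 1, 2, 3, ... for 0 < ratio < 1."""
    return ratio / (1 - ratio)


def excluded_gap_bound(ell):
    """Upper bound on n - E(U_k) obtained from the two series in (17),
    plus the ell**2 lost by stopping the sum at lambda_n."""
    s1 = geometric_tail(alpha ** (math.pi / (8 * ell ** 2)))
    s2 = geometric_tail(math.exp(-math.pi / (32 * ell ** 2)))
    return ell ** 2 + 4 * s1 + s2
```

The ratio `excluded_gap_bound(ell) / ell**2` is essentially constant in
`ell`, as the $O(\ell_n^2)$ claim requires, but the constant is in the thousands
because $\alpha$ is so close to 1.
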

5 Lower Bound

If a vertex $v$ has higher ID than any of its neighbors, then it cannot be eliminated
under Rule k. This simple observation is the basis for

Theorem 5 If $\ell_n = o(\sqrt{n})$, then, for all sufficiently large $n$, the expected size
of the Rule k dominating set is more than $\ell_n^2/4$.

Proof: Let $L_n = \sum_{i=1}^{n} I_i$, where $I_i = 1$ iff node $i$ has a higher ID than all the
nodes in $D_1(X_i)$. Note that $I_i = 1$ iff the nodes $X_{i+1}, X_{i+2}, \dots, X_n$ all fall
outside the disk $D_1(X_i)$. Therefore

$$P(I_i = 1 \mid X_i) = \left(1 - \frac{|D_1(X_i)|}{\ell_n^2}\right)^{n-i} \ge \left(1 - \frac{\pi}{\ell_n^2}\right)^{n-i}. \tag{19}$$

Therefore

$$E(L_n) \ge \sum_{i=1}^{n} \left(1 - \frac{\pi}{\ell_n^2}\right)^{n-i} = \frac{\ell_n^2}{\pi}\left(1 - \left(1 - \frac{\pi}{\ell_n^2}\right)^n\right) = \frac{\ell_n^2}{\pi}(1 - o(1)). \tag{20}$$

Since $C_k(V) \ge L_n$ and $\frac{1}{\pi} > \frac{1}{4}$, the theorem follows.

□
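
The geometric sum in (20) is easy to evaluate directly. The sketch below
compares the term-by-term sum with its closed form and with $\ell^2/4$; the
function names and the sample values of $n$ and $\ell$ are illustrative.

```python
import math


def local_max_lower_bound(n, ell):
    """Sum over i = 1..n of (1 - pi/ell^2)^(n - i), as in (20)."""
    q = 1 - math.pi / ell ** 2
    return sum(q ** (n - i) for i in range(1, n + 1))


def local_max_closed_form(n, ell):
    """Closed form (ell^2/pi) * (1 - (1 - pi/ell^2)^n) of the same sum."""
    q = 1 - math.pi / ell ** 2
    return (ell ** 2 / math.pi) * (1 - q ** n)
```

For n = 10000 and ℓ = 10 both functions are close to $\ell^2/\pi$, which
exceeds $\ell^2/4 = 25$.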



6 Optimality 

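For this section, $\ell_n \le \sqrt{\frac{n}{a \log n}}$, where $a$ is a constant greater than 9. It is
easy to verify that, with asymptotic probability one, there exists a CDS, $\mathcal{C}_{rand}$,
having $O(\ell_n^2)$ vertices: simply partition the square $Q$ into $\lfloor 3\ell_n \rfloor^2$ equal-sized
squares, each with sides of length $s_n = \frac{\ell_n}{\lfloor 3\ell_n \rfloor} = \frac{1}{3} + O(\frac{1}{\ell_n})$, and then pick one
node from each of these small squares. More explicitly, for $0 \le i, j < \lfloor 3\ell_n \rfloor$,
let $Q_{ij} = \{(x, y) : i s_n \le x < (i+1)s_n \text{ and } j s_n \le y < (j+1)s_n\}$. Let $\mathcal{B}$ be
the event that each of the $\lfloor 3\ell_n \rfloor^2$ small squares contains one or more nodes. By
Boole's inequality,

$$P(\mathcal{B}^c) \le \sum_{i,j} P(Q_{ij} \text{ is empty}) = \lfloor 3\ell_n \rfloor^2 \left(1 - \frac{1}{\lfloor 3\ell_n \rfloor^2}\right)^n \tag{21}$$

$$= \lfloor 3\ell_n \rfloor^2 \exp\left(-\frac{n}{9\ell_n^2}(1 + o(1))\right) \tag{22}$$

$$\le \frac{9n}{a \log n}\, e^{-\log n} = O\left(\frac{1}{\log n}\right). \tag{23}$$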

Now given the vertices $V = \{X_1, X_2, \dots, X_n\}$, we construct $\mathcal{C}_{rand} \subseteq V$ as follows:
For each $0 \le i, j < \lfloor 3\ell_n \rfloor$, if $Q_{ij}$ contains at least one vertex, then select one
vertex $V_{ij}$ uniformly at random from among the vertices in $Q_{ij}$, and include $V_{ij}$
in $\mathcal{C}_{rand}$. Thus $\mathcal{C}_{rand}$ is a (random) set of at most $\lfloor 3\ell_n \rfloor^2$ nodes. It can contain
fewer nodes (possibly as few as one), but with asymptotic probability 1, $\mathcal{C}_{rand}$
contains exactly $\lfloor 3\ell_n \rfloor^2$ vertices and is a CDS.
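
The grid construction can be sketched as follows. The code assumes the square
is $[0, \ell] \times [0, \ell]$ and that points are given as coordinate pairs; the
function name, the returned `cell_of` list, and the `seed` parameter are
illustrative additions.

```python
import math
import random


def grid_representatives(points, ell, seed=None):
    """Partition the ell x ell square into floor(3*ell)^2 cells and pick
    one point uniformly at random from every non-empty cell.

    Returns (reps, cell_of): reps maps a cell (i, j) to the index of its
    chosen point, and cell_of[idx] is the cell containing points[idx].
    """
    rng = random.Random(seed)
    m = math.floor(3 * ell)
    s = ell / m                      # side length of a small square
    members, cell_of = {}, []
    for idx, (x, y) in enumerate(points):
        # Clamp so that points on the far edge fall in the last cell.
        cell = (min(int(x / s), m - 1), min(int(y / s), m - 1))
        cell_of.append(cell)
        members.setdefault(cell, []).append(idx)
    reps = {cell: rng.choice(idxs) for cell, idxs in members.items()}
    return reps, cell_of
```

When the side length is 1/3, two points in the same cell are at distance at
most $\sqrt{2}/3$ and representatives of side-adjacent cells are at distance at
most $\sqrt{5}/3$, both less than one, which is why the chosen set dominates
and is connected once every cell is occupied.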

It is worth pointing out that this existence argument cannot be used in a 
straight-forward way as the basis for a localized algorithm because the nodes 
do not know their own locations in the network. One of the main advantages of 
the Rule k algorithm is that a vertex makes its decision based on very limited 
information, namely its list of neighbors and their lists of neighbors. 

Nevertheless, the existence argument is useful for us because it leads to
a lower bound on the size that a CDS can have. The following argument was
influenced by [21]. The appendix of [...] is also pertinent, but we do not see
how to turn the discussion there into a mathematically rigorous proof.

Theorem 6 below is based on the following observation: If $v$ is any
point in $Q$, then at most 81 nodes of $\mathcal{C}_{rand}$ are in $D_1(v)$. In particular, if $\mathcal{C}_{opt}$
is a minimum sized CDS, and $v$ is a node in $\mathcal{C}_{opt}$, then $N(v)$ includes at most
81 nodes of $\mathcal{C}_{rand}$. But $\mathcal{C}_{opt}$ is a dominating set; therefore every node in $\mathcal{C}_{rand}$
must be in $N(v)$ for at least one $v \in \mathcal{C}_{opt}$. We therefore have a lower bound on
the size of $\mathcal{C}_{opt}$:

$$|\mathcal{C}_{opt}| \ge \frac{1}{81}|\mathcal{C}_{rand}|. \tag{24}$$

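Combining (23) with (24), we get

Theorem 6 Suppose $a > 9$, and $\ell_n \le \sqrt{\frac{n}{a \log n}}$ for all $n$. Then there is a
constant $B > 0$ such that, for all $n > 1$,

$$P_n\left(|\mathcal{C}_{opt}| < \frac{\lfloor 3\ell_n \rfloor^2}{81}\right) \le \frac{B}{\log n}.$$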

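Corollary 7 $E(|\mathcal{C}_{opt}|) = \Theta(\ell_n^2)$.
Proof: From (24), we have

$$E(|\mathcal{C}_{opt}|) \ge \frac{1}{81}E(|\mathcal{C}_{rand}|)$$

$$\ge \frac{1}{81}P\left(|\mathcal{C}_{rand}| = \lfloor 3\ell_n \rfloor^2\right) \cdot \lfloor 3\ell_n \rfloor^2$$

$$= \frac{1}{81}\left(1 - O\left(\frac{1}{\log n}\right)\right)\lfloor 3\ell_n \rfloor^2 = \Theta(\ell_n^2).$$

The matching upper bound follows from Theorem 4, since $|\mathcal{C}_{opt}| \le C_k(V)$.

□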



7 Discussion 

In this paper we have analyzed Rule k only for $k > 2$. For $k < 3$, the analysis is
different and quite a bit more complicated. The analysis for that case is treated
in a subsequent paper. Also, here we have only analyzed the application of Rule
k on the entire vertex set of $\mathcal{G}_n$. Clearly Rule k could also be used in conjunction
with other heuristics in order to construct a "small" CDS. For example, Wu and
Li have proposed the "Marking Process", an algorithm for selecting an initial
CDS $\mathcal{M}$. They recommended that the Marking Process be followed by Rules 1
and 2. Dai and Wu subsequently proposed the more general Rule k. The various
Rules 1, 2, 3, ... can be applied one after the other up to some largest k. Dai, Li,
and Wu mark the nodes in the CDS, and with each new rule application, the
set of marked nodes shrinks.

In the case where $\ell_n \to \infty$ and $\ell_n \le \sqrt{n/(3 \log n)}$, it can be shown (see
Appendix 1) that asymptotically nothing is gained by applying the Marking
Process before applying Rule k. It may be possible to obtain further reductions
in the size of the heuristic CDS by successive applications of Rules 1, 2, ..., k
as proposed by Dai, Li, and Wu. However, the rigorous analysis of the Dai, Li,
and Wu heuristic is complicated due to dependence between the variables at
the various stages in the analysis of the heuristic. For example, it seems much
harder to estimate $E(|\mathcal{C}_{k+1}(\mathcal{C}_k(V))|)$ than it is to estimate $E(|\mathcal{C}_{k+1}(V)|)$ (say).
Our analysis only considered $E(|\mathcal{C}_k(V)|)$ for any fixed $k > 2$ and we have shown
that in this case the average size of $\mathcal{C}_k(V)$ is of the same order as the size of the
optimal CDS. So, even a simple application of Rule k to the entire vertex set $V$
produces, on average, a "good" CDS.

Appendix 1: The Marking Process

Wu and Li [...] proposed the following method for selecting an initial CDS $\mathcal{M}$:

Marking Process: a node is included in $\mathcal{M}$ iff it has two neighbors that are not
adjacent (i.e. not directly connected by an edge).
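
The Marking Process is a purely local test and can be sketched in a few lines.
As in the other examples, the graph is assumed to be a dictionary `adj` from
vertices to neighbor sets, and the function name is illustrative.

```python
from itertools import combinations


def marking_process(adj):
    """Return the vertices that have two neighbors not joined by an edge."""
    marked = set()
    for v, nbrs in adj.items():
        if any(b not in adj[a] for a, b in combinations(nbrs, 2)):
            marked.add(v)
    return marked
```

On the path 1-2-3 only the middle vertex is marked, and in a complete graph no
vertex is marked.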

Suppose we apply the Marking Process to the random graph $\mathcal{G}_n$. Let $M = |\mathcal{M}|$
be the number of vertices marked by the marking process. In this appendix,
let $I_i = 1$ iff the $i$th vertex gets marked, i.e. vertex $i$ has two non-adjacent
neighbors. Let $I_i = 0$ otherwise. Thus $M = \sum_{i=1}^{n} I_i$ is the number of marked
vertices. Our goal is to establish the following asymptotic estimate for the
expected value of $M$.

Theorem 8 $E(M) = n - O\left(n \exp\left(-\frac{an}{\ell_n^2}\right)\right)$.

Proof: Since the $I_i$'s are identically distributed, we have

$$E(M) = nP(I_1 = 1). \tag{25}$$

It therefore suffices to prove that $P(I_1 = 0) = O\left(\exp\left(-\frac{an}{\ell_n^2}\right)\right)$. For any $i$, and
any $r > 0$, let $D_r(i)$ be the disk of radius $r$ centered at the vertex labelled $i$.
If vertex 1 happens to fall near the boundary of $Q$, then it may happen that
$D_1(1)$ is not entirely contained in $Q$. But in any case we can partition
$D_1(1)$ into four quarter disks and select one of the four quarter disks $K$ in such
a way that $K$ is contained in $Q$. If $\varphi$ is the axis of symmetry of $K$, let $B_1$ be the set
of points in $K$ whose distance from $\varphi$ is greater than $\frac{1}{2}$. Note that $B_1$ consists
of two disjoint components $B_1^+$, $B_1^-$, and that the distance from $B_1^+$ to $B_1^-$ is 1.
Hence vertex 1 will be marked if both $B_1^+$ and $B_1^-$ contain at least one of the
other $n - 1$ vertices. Define $\mathcal{B}_1$ to be the event that both $B_1^+$ and $B_1^-$ contain
at least one of the other $n - 1$ vertices. In this section only, define $a$ to be the
area of $B_1^+$. The probability that $B_1^+$ contains none of the other $n - 1$ nodes is
$\left(\frac{\ell_n^2 - a}{\ell_n^2}\right)^{n-1}$. The same is true of $B_1^-$. Hence

$$P(I_1 = 0) \le 2\left(\frac{\ell_n^2 - a}{\ell_n^2}\right)^{n-1} = O\left(e^{-an/\ell_n^2}\right). \tag{26}$$

□
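
The area $a$ is an explicit constant. The sketch below computes it under an
assumed reading of the construction: the quarter disk $K$ is placed with its
vertex at the origin and its axis of symmetry along the positive x-axis, so
that one component of $B_1$ is the set of points with $x^2 + y^2 \le 1$,
$y \le x$ and $y > \frac{1}{2}$. The closed form $\frac{\pi}{24} - \frac{\sqrt{3} - 1}{8}$ is derived here from
that reading and is not quoted from the paper; it is compared against a
midpoint-rule integral.

```python
import math


def strip_area_closed_form():
    """pi/24 - (sqrt(3) - 1)/8, the area of one component of B_1
    under the coordinate assumptions stated above."""
    return math.pi / 24 - (math.sqrt(3) - 1) / 8


def strip_area_numeric(steps=100000):
    """Midpoint rule for the integral of sqrt(1 - y^2) - y over
    1/2 <= y <= 1/sqrt(2), i.e. the width of the region at height y."""
    lo, hi = 0.5, 1 / math.sqrt(2)
    h = (hi - lo) / steps
    total = 0.0
    for t in range(steps):
        y = lo + (t + 0.5) * h
        total += math.sqrt(1 - y * y) - y
    return h * total
```

Under these assumptions $a$ is a little under 0.04, so the exponent in (26)
decays slowly in $n/\ell_n^2$.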

Corollary 9 $P(M \ne n) = O\left(n e^{-an/\ell_n^2}\right)$.

Proof: By Boole's inequality,

$$P(M \ne n) = P(I_i = 0 \text{ for some } i) \le nP(I_1 = 0) = O\left(n e^{-an/\ell_n^2}\right).$$ □

Now fix $k > 2$, and let $C_k = |\mathcal{C}_k(V)|$ be the number of vertices in the CDS
which is constructed when Rule k is applied to all nodes in the network. Let
$C'_k = |\mathcal{C}_k(\mathcal{M})|$ be the number of vertices in the CDS which is constructed when
Rule k is applied to $\mathcal{M}$, the nodes marked by the marking process. Provided
$\ell_n \to \infty$ and $\ell_n \le \sqrt{n/(3 \log n)}$, the two quantities rarely differ, so we have the
following corollary to Theorem 8.

Corollary 10 $E(C'_k) \ge E(C_k) - O\left(n^2 e^{-an/\ell_n^2}\right)$.
Proof:

$$E(C'_k) \ge E(C'_k \mid M = n)\,P_n(M = n). \tag{27}$$

If $M = n$, i.e. if $\mathcal{M} = V$ and all nodes in the network are marked, then
$\mathcal{C}_k(V) = \mathcal{C}_k(\mathcal{M})$. Therefore $E(C'_k \mid M = n) = E(C_k \mid M = n)$, and

$$E(C_k \mid M = n)P(M = n) = E(C_k) - E(C_k \mid M \ne n)P(M \ne n). \tag{28}$$

Combining (27) with (28), we get

$$E(C'_k) \ge E(C_k) - E(C_k \mid M \ne n)P(M \ne n)$$

$$\ge E(C_k) - nP(M \ne n) \ge E(C_k) - O\left(n^2 e^{-an/\ell_n^2}\right).$$

□